1.  Introduction


1.1.  Significance  of  diagnostic  problem 

In  the  U.S.  in  1994,  there  were  approximately  182,000  new  cases  and  46,000 
deaths  due  to  breast  cancer,  making  it  second  only  to  lung  cancer  as  the  cause  of  cancer 
death  among  women  [1].  Mammography  is  the  modality  of  choice  for  early  detection  of 
breast  cancer  and  can  significantly  decrease  the  mortality  for  women  undergoing 
screening  [2,3].  Evaluating  mammograms  remains  a  challenging  task  to  radiologists, 
however,  as  they  consider  many  radiographic  and  non-radiographic  features  in  order  to 
decide  whether  a  lesion  is  benign  or  whether  it  should  be  followed  or  biopsied. 
Although  mammography  is  very  sensitive,  there  are  a  large  number  of  false-positive 
biopsies.  Of  women  with  radiographically-suspicious,  nonpalpable  lesions  who  are 
sent  to  biopsy,  only  15  to  34%  actually  have  a  malignancy  by  histologic  diagnosis  [4,5]. 

1.2.  Potential  of  the  proposed  technique 

This study seeks to improve the diagnosis and treatment of breast cancer by
reducing the cost and morbidity of unnecessary biopsies. Cost is a major obstacle to
widespread  acceptance  of  mammographic  screening  [6].  It  has  been  shown  that 
surgeon's  fees  and  biopsy  costs  account  for  over  half  the  cost  of  detecting  small  breast 
cancers  in  a  screening  population  [7].  Preventing  unnecessary  biopsies  is  therefore  one 
of  the  most  important  ways  to  improve  the  efficacy  of  mammographic  screening.  Many 
previous  reports  have  discussed  the  need  to  reduce  the  number  of  benign  biopsies  [8,9]. 

To  improve  early  diagnosis,  we  propose  an  automated  computer-aided  diagnosis 
(CADx)  system  for  mammography.  The  system  will  perform  automated  feature 
extraction  from  mammograms  using  artificial  neural  network  (ANN)  and  other  image 
processing  techniques,  then  predict  the  outcome  of  biopsy  (benign  vs.  malignant).  The 
intent  is  to  identify  probably  benign  lesions  for  which  biopsies  may  be  spared.  This 
study will potentially provide an accurate, consistent aid for the early diagnosis of
breast  cancer. 

1.3.  Computer-aided  diagnosis  using  artificial  neural  networks 

In  medical  imaging,  CADx  systems  provide  radiologists  with  information  from 
computerized  analysis  of  images  or  image  features,  thus  helping  radiologists  detect  or 
diagnose  diseases  more  accurately,  easily,  and  consistently  [10,11].  In  mammography, 
there have been numerous reports on computerized detection [12-18] or diagnosis [19-24]
of breast cancer. Although both are generally considered CADx systems, detection
systems  locate  suspicious  lesions  in  an  image,  while  diagnosis  systems  such  as  this 
study  determine  whether  those  lesions  are  benign  or  malignant. 

This study focuses on the use of artificial neural networks (ANNs), which are
computer models inspired by the structure and function of biological neural networks
such as the cerebral cortex of the human brain. Most ANNs are characterized by
multiple, simple computing elements or neurons that work in parallel. The neurons
interact globally through connections that have strengths or weights, and together they
can duplicate aspects of human intelligence while incorporating the processing power
of computers [25]. The classification rules are not defined a priori; instead, the network is
trained by presenting it with medical findings and final diagnoses from many patients.
The  network  "learns"  by  adapting  its  weights  to  improve  its  diagnosis  for  each  patient, 
just  as  physicians  become  more  experienced  with  time.  Once  trained,  the  network  can 
generalize  to  new  patients  it  has  not  seen  before. 

ANNs  are  very  useful  in  handling  complex  decision  tasks  such  as  those  involved 
in  medical  diagnoses,  where  multiple  findings  are  subtly  related  in  ways  which  are 
often  difficult  to  express  in  the  form  of  diagnostic  criteria.  The  networks  can  capture 
such  relationships  between  the  input  findings  to  generate  robust  outputs.  ANNs  solve 
problems  empirically  without  requiring  any  prior  knowledge  of  distribution  functions 
or  any  type  of  statistical  modeling,  yet  ANNs  are  able  to  duplicate  solutions  of  statistical 
methods [26]. Finally, ANNs are consistent: given the same inputs they produce the
same output, and they are not prone to human fatigue or bias.

1.4.  Summary  of  progress  from  previous  budget  period. 

During  the  first  budget  period,  our  institute  adopted  the  Breast  Imaging 
Reporting  and  Data  System  or  BI-RADS  lexicon,  which  was  endorsed  by  the  American 
College  of  Radiology  to  improve  upon  the  consistency  of  mammographic 
reports  [27,28].  The  use  of  BI-RADS  descriptors  would  allow  the  techniques  developed 
in  this  study  to  be  used  in  all  institutions  that  adopt  this  standardized  system.  We 
investigated  the  use  of  this  new  lexicon  to  take  advantage  of  its  potential  for  general 
applicability  of  the  CADx  system.  This  work  was  presented  at  a  national  conference 
[29]  and  subsequently  published  in  two  parts  in  a  peer-reviewed  journal  [30,31].  The 
ANN  was  developed  using  206  patients  who  underwent  excisional  biopsy  and 
pathologic  diagnosis.  The  ANN  was  evaluated  by  receiver  operating  characteristic 
(ROC)  analysis  and  its  performance  was  compared  to  that  of  expert  mammographers. 
That  study  was  then  extended  by  identifying  an  optimal  subset  of  input  features  to 
simplify  the  network.  This  work  was  presented  at  two  national  conferences  [32,33]  and 
subsequently  published  in  a  peer-reviewed  journal  [34]. 
1.5.  Technical  Objectives

The  technical  objectives  pertaining  to  the  second  budget  period  are  aims  2a  and  2b 
from  the  list  of  aims  for  the  entire  budget  period  shown  below: 

(1)  Identify an optimal subset of features that would provide adequate diagnostic
performance.

1a.  Retrain the features-to-diagnosis ANN using sub-groups of features. The
goal is to maintain the sensitivity of the original network while keeping
specificity reasonably high.

1b.  Encode the multiple-value features into binary "sub-features", then repeat
step 1a to reduce the number of sub-features. The sub-features will be easier
to extract by automated schemes.

(2)  Investigate conventional and ANN methods for extracting the optimal subset of
features directly from mammograms.

2a.  Implement  established  techniques  which  have  demonstrated  promise  for 
extracting  features  belonging  to  our  reduced  feature  set. 

2b.  Investigate  several  ANN  techniques  for  feature  extraction,  focusing  on 
features  which  may  be  difficult  to  classify  by  conventional  techniques  in 
step  2a.  For  both  2a  and  2b,  evaluate  these  techniques  by  comparing  the 
extracted  features  against  radiologists'  findings. 

(3)  Evaluate  the  automated  CAD  system  clinically. 

3a.  Implement  the  CAD  system  by  feeding  the  best  feature  extraction  techniques 
from  step  2  into  the  best  features-to-diagnosis  ANN  from  step  1,  and  compare 
the  resulting  diagnosis  against  the  biopsy  result. 

3b.  Evaluate  the  accuracy  of  the  CAD  system  retrospectively  by  using  patient 
records  from  our  computerized  mammography  database. 


Figure 1.  Time line for proposal project period (horizontal axis in months: 6, 12, 18, 24, 30, 36).

In the following sections, we will report in detail on the progress in aims 2a and 2b.
2.  Body

The  original  hypothesis  of  this  proposal  was  to  identify  a  small  number  of 
important  radiologist-extracted  mammographic  findings,  then  attempt  to  extract  those 
findings  using  artificial  neural  network  (ANN)  or  classic  image  processing  techniques. 
Our work during the current, second budget period has produced two major new
findings which affect that hypothesis. First, we discovered significant new potential
in utilizing radiologist-extracted findings to predict malignancy as well as invasion among
breast lesions. Second, our initial attempts at feature extraction did not yield the level of
performance  required  for  an  accurate,  automated  computer-aided  diagnosis  system. 

For  these  reasons,  we  have  increased  the  emphasis  on  using  radiologist-extracted 
findings  to  provide  the  best  possible  diagnosis  of  breast  cancer. 

In  the  following  sections,  we  will  describe  four  major  studies  undertaken  during 
the  second  budget  period: 

2.1)  We used the radiologist's impression as an input feature to the ANN. This
work was presented at the Radiological Society of North America (RSNA)
'95 annual meeting and published in the proceedings issue of Radiology [35].

2.2)  We  used  global  thresholding  to  extract  the  mass  margin  feature,  and  also 
presented  this  work  at  RSNA  '95  for  publication  in  the  proceedings  [36]. 

2.3)  We  studied  the  error  surfaces  of  ANNs  using  the  optimized  subset  of 
findings  identified  in  the  first  budget  period.  This  work  was  presented  at 
the  World  Congress  of  Neural  Networks  (WCNN)  '96  conference  [37]. 

2.4)  We  explored  the  feasibility  of  extending  the  original  ANN  which  used 
radiologist-extracted  findings  to  predict  breast  lesion  malignancy  so  that  it 
would  also  predict  whether  malignant  lesions  are  invasive  or  in  situ 
carcinoma. This work was presented at the International Society for Optical
Engineering (SPIE) Medical Imaging 1996 conference [38] and was accepted
for publication in Radiology [39].

The first two studies were undertaken to address specific aim 2a, while the
latter two studies address specific aim 2b, with our new emphasis on
radiologist-extracted findings.

2.1.  ANN  incorporating  radiologist  impression  as  an  input. 

The purpose of this study was to develop an ANN as a diagnostic aid in
mammography, predicting breast lesion malignancy based on the radiologist
impression and an optimized subset of BI-RADS radiographic features. Until now, all
CADx studies have followed one of two paradigms, either pitting ANN output against
radiologist diagnosis, or encouraging the radiologist to incorporate the ANN output
into her final diagnosis. This study explores a novel option in which the ANN considers
the radiologist diagnosis as an input finding along with the optimized subset of BI-RADS
findings previously identified. The use of the radiologist impression as an input was
justified since the network was intended to assist the radiologist in making a diagnosis.
Since the radiologist impression is based on the human expert's consideration of the
mammograms, clinical findings, and general experience, we hypothesized that it may
provide important diagnostic information for the ANN.

A  3-layer  backpropagation  ANN  was  developed  to  predict  the  outcome  of  biopsy 
using  only  4  features:  3  from  the  BI-RADS  lexicon  (mass  margin,  calcification 
description,  and  age)  plus  the  radiologist  impression.  This  ANN  architecture  and  the 
network  training  algorithm  have  been  described  in  detail  in  the  proposal.  The  choice  of 
those  three  particular  BI-RADS  findings  was  arrived  at  after  an  optimized  reduction  of 
the  number  of  input  features,  as  described  in  the  previous  year's  progress  report.  Using 
the  round  robin  or  leave-one-out  technique  with  206  patients,  network  performance  was 
evaluated  by  Az,  the  receiver  operating  characteristic  (ROC)  area  index. 
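
The round robin evaluation described above can be sketched as follows. This is a minimal illustration and not the code used in the study: the `train` and `predict` callables are hypothetical placeholders for any classifier, and the ROC area is computed nonparametrically (the fraction of malignant/benign pairs that are ranked correctly). That estimate is assumed here as a stand-in for Az and may differ somewhat from a value obtained by fitting a smooth ROC curve.

```python
def auc(scores, labels):
    """Nonparametric ROC area: P(malignant score > benign score), ties count 1/2."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one case of each class")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def leave_one_out_scores(features, labels, train, predict):
    """Score every case with a model trained on all of the other cases.

    `train(features, labels)` returns a model and `predict(model, case)`
    returns a number; both are hypothetical callables supplied by the caller.
    """
    scores = []
    for i in range(len(features)):
        train_x = features[:i] + features[i + 1:]
        train_y = labels[:i] + labels[i + 1:]
        model = train(train_x, train_y)
        scores.append(predict(model, features[i]))
    return scores
```

Because each case is scored by a network that never saw it during training, the resulting ROC area reflects performance on unseen patients rather than memorization of the training set.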

We found that given age and the 2 radiographic features but not the radiologist
impression, the ANN performed with Az=0.83, which was not significantly worse than
the expert mammographer's Az=0.85. Given the additional input of the radiologist
impression, the network's Az=0.89 was better than the radiologists'
performance, with p=0.07. In Fig. 2, the ROC curves for the 4-feature ANN given the
radiologist impression and the radiologist impression itself are compared.


Fig.  2.  ROC  curves  of  4-feature  ANN  incorporating  the  radiologist 
impression versus the radiologist impression itself. The ANN outperformed
radiologists with p=0.07.

In conclusion, a diagnostic aid was developed for mammography that
accurately  predicted  malignancy  given  only  4  input  features,  including  the  radiologist 
impression.  By  taking  advantage  of  the  radiologists'  considerable  expertise,  the 
combined  system  was  more  accurate  than  radiologists  alone. 
2.2.  Extraction  of  mass  margin  by  global  thresholding

In the first budget period of this proposal, we demonstrated ANNs
which  predict  breast  mass  malignancy  based  on  only  patient  age  and  the  mass  margin 
finding  extracted  by  radiologists.  The  purpose  of  this  study  was  to  attempt  to  replace 
this  very  important  mass  margin  finding  with  one  or  two  computer-extracted  features. 

For  this  study,  41  mammograms  with  biopsy-proven  masses  were  randomly 
selected.  The  mammograms  were  digitized  to  100  micron  per  pixel  resolution,  and  a 
512  by  512  pixel  region  of  interest  (ROI)  centered  at  the  mass  was  extracted.  To  facilitate 
the  extraction  of  the  boundary  of  each  mass,  the  background  trend  in  the  ROI  was  fitted 
to  a  second-order  polynomial  and  subtracted,  and  the  ROI  was  further  median  filtered 
to  reduce  noise.  Starting  from  the  center  of  the  mass,  possible  mass  boundaries  were 
identified  using  a  combination  of  region  growing  and  global  thresholding  techniques. 
The  results  from  each  iteration  are  displayed  to  the  user  as  a  false  color  image.  The  user 
determines which color-coded threshold most closely approximates the boundaries of
the  mass.  In  other  words,  the  final  mass  boundary  is  manually  selected  from  many 
automatically  generated  candidates. 
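
The candidate-generation step can be illustrated with a short sketch. It is not the study's implementation. It assumes the ROI is a list of rows of gray levels that has already been background-corrected and median filtered, that the mass is brighter than its surroundings, and that pixels are joined by 4-connectivity. The function names are hypothetical.

```python
from collections import deque


def grow_region(image, seed, threshold):
    """Return the set of pixels at or above threshold that connect to the seed."""
    rows, cols = len(image), len(image[0])
    if image[seed[0]][seed[1]] < threshold:
        return set()
    region = {seed}
    queue = deque([seed])
    while queue:
        r, c = queue.popleft()
        for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
            inside = 0 <= nr < rows and 0 <= nc < cols
            if inside and (nr, nc) not in region and image[nr][nc] >= threshold:
                region.add((nr, nc))
                queue.append((nr, nc))
    return region


def candidate_regions(image, seed, thresholds):
    """Grow one candidate mass region per global threshold, keyed by threshold."""
    return {t: grow_region(image, seed, t) for t in thresholds}
```

Each threshold yields one candidate region. In the semi-automated scheme described above, the user would then pick the candidate whose outline best matches the visible mass.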

Given  the  mass  boundary,  the  irregularity  and  circularity  are  calculated  in  a 
straightforward  manner.  The  irregularity  is  defined  as  the  ratio  of  the  perimeter  of  the 
mass  to  that  of  a  circle  of  equal  area.  The  circularity  is  defined  as  the  fraction  of  overlap 
in  area  between  the  mass  and  a  centered  circle  of  equal  area.  The  hypothesis  is  that 
well-circumscribed  and  thus  probably  benign  masses  will  be  characterized  by  low 
irregularity  and  high  circularity,  while  the  opposite  trends  will  apply  to  spiculated 
masses. 
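
The two shape measures defined above can be computed from a segmented mass as in the following sketch. The mask representation (a set of `(row, col)` pixels), the edge-counting perimeter estimate, and the use of the centroid as the circle center are assumptions made for illustration; the report does not specify these details.

```python
import math


def irregularity(mask):
    """Perimeter of the mass divided by the perimeter of an equal-area circle."""
    area = len(mask)
    # Perimeter estimate: count pixel edges that face a pixel outside the mask.
    # On a pixel grid this overestimates the length of smooth curves, so even
    # a digitized disk scores somewhat above 1.
    perimeter = sum(
        1
        for (r, c) in mask
        for nb in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1))
        if nb not in mask
    )
    circle_perimeter = 2.0 * math.sqrt(math.pi * area)
    return perimeter / circle_perimeter


def circularity(mask):
    """Fraction of the mass area inside an equal-area circle at the centroid."""
    area = len(mask)
    centroid_r = sum(r for r, _ in mask) / area
    centroid_c = sum(c for _, c in mask) / area
    radius_sq = area / math.pi  # circle with the same area as the mass
    inside = sum(
        1
        for (r, c) in mask
        if (r - centroid_r) ** 2 + (c - centroid_c) ** 2 <= radius_sq
    )
    return inside / area
```

As a check on the intended trend, a compact 10 by 10 square gives lower irregularity and higher circularity than a 1 by 20 strip of pixels.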

We  found  the  ANN  performed  reasonably  well  with  Az  of  0.82  ±  0.07  when  given 
the  two  findings  of  patient  age  and  irregularity.  This  performance  was  somewhat 
improved  to  Az  of  0.89  ±  0.06  when  the  ANN  was  given  circularity  as  an  additional, 
third  input  feature.  Both  of  these  ANNs  were  much  worse,  however,  when  compared 
to  an  ANN  based  on  age  and  the  radiologist-extracted  mass  margin,  which  performed 
with  Az  of  0.96  ±  0.03.  It  should  be  noted  that  for  this  limited  sample  of  masses,  the 
expert  radiologists'  impressions  distinguished  benign  from  malignant  masses  perfectly 
with  Az  of  1.0. 

These results fell short of our goals for several reasons. Unlike our previous
studies, these ANNs did not match or outperform the radiologists that they are
intended to assist. Since our expert radiologists already diagnose masses with very
high accuracy, there is in fact very little room for improvement. Finally, there was
relatively low correlation between the findings extracted by radiologists and those
extracted by computer. We have initiated attempts to improve these results by using
more sophisticated measures of texture and shape such as fractal dimension analysis.
We have also increased the number of digitized mammograms from 41 to 100 to reduce
the large standard deviations associated with the ROC areas.
2.3.  Error  surfaces  of  simplified  ANN.

The  purpose  of  this  study  was  to  investigate  the  underlying  behavior  of  an 
artificial  neural  network  (ANN)  for  computer-aided  diagnosis  in  mammography.  A 
single-layer  perceptron  was  developed  to  predict  whether  masses  were  benign  or 
malignant,  based  only  on  the  patient  age  and  the  mass  margin  finding  characterized  by 
radiologists.  The  performance  of  this  very  simple  ANN  was  comparable  to  much  more 
complex  ANNs  which  required  many  more  features.  The  network's  three-dimensional 
error  surfaces  were  visualized  by  independently  varying  the  perceptron's  two  weights 
and bias value, while measuring performance by mean squared error and ROC area
index. This study provided a unique opportunity to study the underlying behavior of
an  actual  ANN  medical  application,  which  will  assist  in  development  of  other  ANNs  for 
computer-aided  diagnosis  in  mammography. 

From  266  randomly  selected  patients  who  underwent  biopsy,  138  cases  had 
masses.  Expert  radiologists  characterized  the  mass  margin  as  1  of  5  categories  (in 
increasing  order  of  suspicion:  well  circumscribed,  microlobulated,  obscured,  indistinct, 
or  spiculated)  and  the  patient  age  was  recorded.  The  round  robin  or  "leave  one  out" 
data  sampling  technique  was  used  as  before.  The  network  employed  was  a  single-layer 
perceptron  with  only  2  inputs,  the  mass  margin  and  patient  age,  and  a  bias  term. 
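
A network of this form can be sketched in a few lines. The sketch below is illustrative only: the input scaling in `scale_inputs`, the learning rate, the number of epochs, and the use of the delta rule on squared error are all assumptions, since the report does not give these details.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def scale_inputs(margin_category, age_years):
    """Map margin category 1..5 and age in years to roughly 0..1 (assumed scaling)."""
    return (margin_category - 1) / 4.0, age_years / 100.0


def perceptron_output(weights, margin, age):
    """Sigmoid of a weighted sum of the two inputs plus a bias term."""
    w_margin, w_age, bias = weights
    return sigmoid(w_margin * margin + w_age * age + bias)


def train_perceptron(cases, labels, rate=0.5, epochs=2000):
    """Fit the two weights and the bias by gradient descent on squared error."""
    w_margin, w_age, bias = 0.0, 0.0, 0.0
    for _ in range(epochs):
        for (margin, age), target in zip(cases, labels):
            out = perceptron_output((w_margin, w_age, bias), margin, age)
            # Delta rule: error times the slope of the sigmoid at this output.
            delta = (target - out) * out * (1.0 - out)
            w_margin += rate * delta * margin
            w_age += rate * delta * age
            bias += rate * delta
    return w_margin, w_age, bias
```

With only three adjustable parameters, the whole model can be inspected directly, which is what makes the error surface study in this section feasible.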


Fig  3.  Feature  space:  input  patterns  vs.  network  outputs 

The  input  patterns  are  superimposed  over  a  contour  plot  of  the  trained  network's 
outputs  over  the  entire  range  of  the  2  inputs,  mass  margin  and  patient  age.  The  patterns 
belong  to  1  of  2  classes,  benign  or  malignant,  shown  as  circles  and  x  marks  respectively. 
Malignancies  tend  to  occur  for  older  patients  with  high  mass  margin  values  (upper 
right portion of graph). The 2-input perceptron with sigmoidal thresholding generated
continuous, linear hyperplanes depicted as contours. Since the 2 pattern classes
overlap, it is not possible to separate them by any hyperplane, so any chosen threshold
is necessarily a trade-off between sensitivity and specificity. Lower thresholds of the
network outputs would yield higher sensitivity (detecting more cancers), while higher
thresholds would produce higher specificity (fewer false positive biopsies). Usually low
thresholds are chosen to ensure cancer detection.
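
The trade-off can be made concrete with a small helper that counts outcomes at a given threshold. This is a generic sketch; the convention that an output at or above the threshold counts as a positive call is an assumption.

```python
def sensitivity_specificity(outputs, labels, threshold):
    """Return (sensitivity, specificity) when outputs >= threshold are called malignant.

    labels use 1 for malignant and 0 for benign.
    """
    tp = sum(1 for o, y in zip(outputs, labels) if y == 1 and o >= threshold)
    fn = sum(1 for o, y in zip(outputs, labels) if y == 1 and o < threshold)
    tn = sum(1 for o, y in zip(outputs, labels) if y == 0 and o < threshold)
    fp = sum(1 for o, y in zip(outputs, labels) if y == 0 and o >= threshold)
    return tp / (tp + fn), tn / (tn + fp)
```

For example, with outputs `[0.1, 0.3, 0.6, 0.4, 0.7, 0.9]` and labels `[0, 0, 0, 1, 1, 1]`, a threshold of 0.35 catches all three malignancies but sends one benign case to biopsy, while a threshold of 0.65 spares all benign cases but misses one malignancy.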


Figs. 4a and 4b visualize the network's error surfaces in 3-dimensional weight space (1
weight each for the mass margin, age, and bias). Performance was evaluated by doing a
grid search over a range of weights and plotting the testing RMSE and Az against all 3
combinations of 2 weights at a time. The best weights are indicated with white arrows
in each plot, and for comparison the best weights from the opposite plots are shown in
gray. Note the striking differences in the coordinates for the best weights as well
as in the underlying error surfaces. This distinction is important because most
computer-aided diagnosis applications are evaluated in terms of Az rather than MSE.
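
A grid search of this kind can be sketched as below. The grid values, the function names, and the nonparametric ROC area used in place of Az are assumptions for illustration, and the sketch scores whatever cases it is given rather than reproducing the study's testing procedure.

```python
import math
from itertools import product


def outputs_for(cases, weights):
    """Sigmoid perceptron outputs for (margin, age) cases and (w_margin, w_age, bias)."""
    w_margin, w_age, bias = weights
    return [1.0 / (1.0 + math.exp(-(w_margin * m + w_age * a + bias)))
            for m, a in cases]


def rmse(outputs, labels):
    return math.sqrt(sum((o - y) ** 2 for o, y in zip(outputs, labels)) / len(labels))


def auc(outputs, labels):
    """Fraction of malignant/benign pairs ranked correctly (ties count 1/2)."""
    pos = [o for o, y in zip(outputs, labels) if y == 1]
    neg = [o for o, y in zip(outputs, labels) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def grid_search(cases, labels, grid):
    """Evaluate every weight triple on the grid; return the best by each criterion."""
    best_rmse = None  # (lowest RMSE, weights)
    best_auc = None   # (highest ROC area, weights)
    for weights in product(grid, repeat=3):
        out = outputs_for(cases, weights)
        err, area = rmse(out, labels), auc(out, labels)
        if best_rmse is None or err < best_rmse[0]:
            best_rmse = (err, weights)
        if best_auc is None or area > best_auc[0]:
            best_auc = (area, weights)
    return best_rmse, best_auc
```

One reason the two surfaces can differ is that a rank-based ROC area depends only on the ordering of the outputs. Changing the bias alone shifts every output in the same direction and leaves that ordering, and hence the ROC area, unchanged, whereas the RMSE does change.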

Fig  4a.  Error  surfaces  in  weight  space  (testing  RMSE) 
Since the perceptron minimized training MSE, not surprisingly the testing RMSE
surfaces were also well behaved with no local minima and an obvious global minimum
(shown as trenches of deep gray). In comparison, the Az plots revealed a large,
irregularly shaped global maximum. Because the perceptron convergence algorithm
optimized MSE rather than Az, the weights that minimized error were actually quite
far from the global maximum of Az. Fortunately, Az performance
was very good over a wide range of weights, so the suboptimal network solution still
performed nearly as well as the actual global maximum. This study demonstrated that it is
important to be aware that optimizing ANNs by MSE may not necessarily result in
optimization of the Az.

Fig  4b.  Error  surfaces  in  weight  space  (Az) 
2.4.  Predicting  invasion  of  breast  cancers.

The  purpose  of  this  study  was  to  develop  an  ANN  to  predict  breast  cancer 
invasion  based  on  BI-RADS  mammographic  findings  and  age.  For  patients  classified  as 
having  invasive  breast  cancer,  excisional  biopsy  may  be  obviated  by  obtaining 
histologic confirmation via stereotaxic needle core biopsy, and the patients may then
undergo a single-stage surgery for mastectomy and/or axillary dissection.

The  three  studies  described  above  focus  on  obviating  benign  biopsies. 
Additionally,  as  many  as  80%  of  biopsied  malignancies  are  invasive  [40].  Traditionally, 
these  patients  require  a  diagnostic  excisional  biopsy/lumpectomy,  followed  by  the 
second,  therapeutic  surgical  procedure  of  mastectomy  and/or  axillary  dissection.  For 
these  invasive  cancers,  stereotaxic  biopsy  has  also  been  proposed  to  provide  histologic 
diagnosis  in  lieu  of  excisional  biopsy,  so  that  the  patients  may  undergo  a  single-stage 
therapeutic surgical procedure for the mastectomy and/or axillary dissection [41,42].
As an alternative to excisional and stereotaxic biopsy, the current study proposes an
artificial neural network (ANN) computer model intended to provide similarly accurate
diagnosis while being completely noninvasive and involving no surgical procedures.
This ANN can assist radiologists and surgeons in predicting invasion among nonpalpable,
mammographically suspect lesions.

As  before,  266  biopsied  lesions  were  randomly  selected  (96  malignant,  170 
benign).  Based  on  9  BI-RADS  mammographic  findings  and  patient  age,  a  3-layer 
backpropagation  network  was  developed  to  predict  whether  the  96  malignant  lesions 
were  in  situ  or  invasive.  Performance  was  measured  by  Az  using  the  round  robin 
sampling  technique  as  before. 

Using  the  96  biopsy-proven  malignant  cases,  the  network  distinguished  between 
invasive  and  in  situ  cancers  very  well  with  Az  of  0.91  ±  0.03.  The  ROC  curve  for  this 
network  is  shown  in  Fig.  5,  and  the  histogram  of  neural  network  outputs  for  all  cases  is 
shown  in  Fig.  6,  both  on  the  following  page. 
Figure 5.  ROC curve for ANN predicting invasion (horizontal axis: false positive fraction).

The  area  under  the  curve,  Az,  of  0.91  indicated  that  the  network  predicted 
invasion  with  a  very  high  degree  of  accuracy. 


Figure  6.  Histogram  of  ANN  outputs  for  malignant  cases  only 

Note  the  threshold  denoted  by  the  dashed  line.  Outputs  for  all  in  situ 
cancers  were  below  that  threshold  and  thus  correctly  classified  (100% 
specificity),  while  48  of  68  invasive  cancers  were  above  the  threshold  and 
thus  also  correctly  classified  (71%  sensitivity). 
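
The operating point described in this caption corresponds to placing the threshold just above the highest in situ output and counting the invasive cases that remain above it: 48/68 is approximately 0.706, which rounds to the 71% sensitivity quoted. A minimal sketch of that selection follows; the function name and the strict "greater than" convention are assumptions.

```python
def full_specificity_threshold(outputs, is_invasive):
    """Pick the cut at the highest in situ output and report invasive sensitivity.

    Returns (threshold, sensitivity). Every in situ case scores at or below
    the threshold, so specificity on this sample is 100% by construction.
    """
    in_situ = [o for o, inv in zip(outputs, is_invasive) if not inv]
    invasive = [o for o, inv in zip(outputs, is_invasive) if inv]
    if not in_situ or not invasive:
        raise ValueError("need both in situ and invasive cases")
    threshold = max(in_situ)
    detected = sum(1 for o in invasive if o > threshold)
    return threshold, detected / len(invasive)
```

A threshold chosen this way is tuned to the sample at hand, so the specificity it achieves on new patients could be lower than 100%.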
The current study demonstrated that BI-RADS mammographic features and
patient age could be used to develop an artificial neural network that distinguishes
between in situ and invasive carcinoma. For patients similar to those considered in
this  study,  the  network  would  correctly  identify  100%  of  in  situ  cancers  and  71%  of 
invasive  cancers. 

Compared  to  previous  studies,  this  work  is  unique  and  important  in  several 
respects.  By  using  the  BI-RADS  lexicon  to  encode  morphological  features,  the  neural 
networks  developed  in  this  study  should  be  applicable  to  other  institutions  which  have 
adopted  this  standard.  Moreover,  this  study  was  the  first  to  develop  a  multivariate 
predictive  model  using  readily  available  medical  findings,  i.e.,  BI-RADS 
mammographic  features  and  patient  age,  to  accurately  classify  invasion  among  breast 
cancers.  By  providing  information  which  was  previously  available  only  through 
biopsy,  the  artificial  neural  network  may  assist  in  surgical  planning  for  patients  with 
breast  lesions,  and  may  reduce  the  cost  and  morbidity  of  "unnecessary"  surgical 
biopsies. 


3.  Conclusions 


The goal of this proposal is to develop a computer-aided diagnosis system to
automatically  extract  radiographic  features  from  the  mammogram,  then  use  an  artificial 
neural  network  (ANN)  to  merge  those  features  to  predict  breast  lesion  malignancy. 
During  the  first  budget  period,  we  successfully  developed  an  ANN  that  merges 
radiologist-extracted  features  to  predict  malignancy.  We  also  identified  an  optimal 
subset  of  input  features  while  maintaining  diagnostic  accuracy. 

During the current, second budget period, we completed four studies. In
accordance with specific aim 2a, we improved the performance of the ANN using the
optimized subset of findings by incorporating radiologist impression as an additional
input finding, as summarized in section 2.1 above. In section 2.2, we described a
semi-automated technique using classic image processing techniques to extract and
characterize the boundary of breast masses, and developed ANNs using those
boundary findings. In accordance with specific aim 2b, we explored the use of ANN
techniques for computer-aided diagnosis of breast cancer. In section 2.3, we studied the
underlying behavior of these networks by examining their error surfaces in weight
space. As discussed in the preamble to section 2, due to the less promising results of the
automated feature extraction approach coupled with the highly promising results from
using radiologist-extracted BI-RADS findings, we increased our emphasis on the latter
approach. Specifically in section 2.4, we developed an ANN to predict invasion among
breast malignancies. Together, these studies provide important new findings which
are crucial for a complete system for the computer-aided diagnosis of breast cancer.